
Clean up HTTP async iterator code #411

Merged · 22 commits · May 20, 2019

Conversation

@ry (Member) commented May 16, 2019

@ry ry force-pushed the http_refactor branch 2 times, most recently from 2663596 to a43e200 Compare May 16, 2019 18:00
@ry ry requested a review from piscisaureus May 16, 2019 18:48
@piscisaureus (Member) left a comment

It needs backpressure semantics.

@ry ry force-pushed the http_refactor branch 2 times, most recently from 2c056ba to 554f941 Compare May 17, 2019 04:56
@ry ry force-pushed the http_refactor branch from 554f941 to 1ab72b3 Compare May 17, 2019 05:03
@kevinkassimo (Contributor) commented May 17, 2019

@ry It seems that with this implementation there will always be either 0 or 1 element in sendQueue/recvQueue?

@ry (Member, Author) commented May 17, 2019

@kevinkassimo I think so too, but I haven't been able to come up with a simplification.

@piscisaureus (Member)

I'm sorry, but this still creates an unbounded queue.

Imagine multiple physical TCP connections coming in while no requests are being processed.
They're all going to call .send() on the latch, which makes that queue grow without backpressure.
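To illustrate the missing semantics: a rendezvous-style channel makes send() resolve only after a receiver has taken the value, so a producer that awaits send() is suspended until a consumer is ready. This is a minimal sketch for illustration, not the actual deno_std code:

```typescript
// Rendezvous channel sketch: send() resolves only once a matching
// recv() takes the value. An accept loop that awaits send() therefore
// stops pulling in new connections until someone is ready to process
// them -- the backpressure that a plain latch + queue lacks.
class Rendezvous<T> {
  private senders: Array<{ value: T; resolve: () => void }> = [];
  private receivers: Array<(value: T) => void> = [];

  send(value: T): Promise<void> {
    const receiver = this.receivers.shift();
    if (receiver !== undefined) {
      receiver(value);              // a recv() was already waiting
      return Promise.resolve();
    }
    // No receiver yet: park this sender until recv() picks it up.
    return new Promise((resolve) => this.senders.push({ value, resolve }));
  }

  recv(): Promise<T> {
    const sender = this.senders.shift();
    if (sender !== undefined) {
      sender.resolve();             // unblock the parked send()
      return Promise.resolve(sender.value);
    }
    // No sender yet: park this receiver until send() arrives.
    return new Promise((resolve) => this.receivers.push(resolve));
  }
}
```

An accept loop written as `while (true) { await ch.send(await listener.accept()); }` then naturally stops accepting while no request is being consumed.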

@piscisaureus (Member)

I'll write something myself

@piscisaureus (Member)

See 29f3e8e.
It can still be cleaned up a lot further. A particular issue is that there's no "clean" way for a keep-alive TCP connection to close.
Just showing what the general pattern should be.

@piscisaureus (Member)

I realize there is one big issue with this, which is that one tcp connection can block processing of others.

@piscisaureus (Member)

... which has been fixed now.

@ry (Member, Author) commented May 18, 2019

Looks fine. I’m slightly worried it doesn’t work when you respond to a request out of order? But I think we have tests for that...

@piscisaureus (Member) commented May 18, 2019

Looks fine. I’m slightly worried it doesn’t work when you respond to a request out of order? But I think we have tests for that...

That's no longer possible, because no new requests will be accepted on the same connection before the previous one has finished responding.
This is also how HTTP works in practice. The server could, e.g., respond with 'Connection: close', or send a response without a 'Content-Length' header, and it wouldn't even be valid for the client to send another request on the same connection, so clients always have to wait for a response before sending a new request.
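The serialized behavior described here can be sketched as a per-connection loop. `readRequest` and `writeResponse` below are hypothetical helper names, not the actual http/server.ts API:

```typescript
// Sketch of serial keep-alive handling: the next request on a
// connection is only read after the previous response has been fully
// written, so out-of-order responses cannot occur on one connection.
async function serveConnection(
  conn: {
    readRequest(): Promise<{ keepAlive: boolean } | null>;
    writeResponse(body: string): Promise<void>;
  },
): Promise<void> {
  while (true) {
    const req = await conn.readRequest(); // blocks; no pipelined reads
    if (req === null) break;              // connection closed by peer
    await conn.writeResponse("hello");    // respond before reading more
    if (!req.keepAlive) break;            // honor `Connection: close`
  }
}
```

Concurrency across connections is preserved by running one such loop per accepted connection; only requests within a single connection are serialized.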

@piscisaureus (Member) left a comment

LGTM

@zekth (Contributor) commented May 18, 2019

Might this PR fix the issue of malformed or encrypted MIME headers?
https://github.com/denoland/deno_std/pull/411/files#diff-2e4c2adf52a4573569ea4cebb26b523eR235

I.e., returning a 400 error if the request can't be properly read.

@piscisaureus (Member)

@zekth I don't know. Is there a test for this issue?

@zekth (Contributor) commented May 18, 2019

@zekth I don't know. Is there a test for this issue?

Not at the moment, because we use Headers in tests rather than raw HTTP, which doesn't give us the flexibility of sending anything we want in the request.

But I can write some like those in: https://github.com/denoland/deno_std/blob/master/http/racing_server_test.ts

For ref: denoland/deno#2346

@kevinkassimo (Contributor) commented May 18, 2019

This PR results in a 1/3 reduction of req/sec:

master:

Running 10s test @ http://localhost:4500
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   353.40us  153.27us   2.88ms   95.73%
    Req/Sec    14.48k   692.05    15.67k    68.32%
  290989 requests in 10.10s, 13.88MB read
Requests/sec:  28809.59
Transfer/sec:      1.37MB

this branch:

Running 10s test @ http://localhost:4500
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    24.05ms   61.05ms 353.56ms   88.58%
    Req/Sec    11.01k     4.44k   15.41k    81.50%
  195783 requests in 10.05s, 9.34MB read
Requests/sec:  19489.73
Transfer/sec:      0.93MB

Latency also seems exceptionally problematic, effectively 100× worse.
I believe this is somehow related to the removal of the pipelining code? Not sure whether wrk does pipelining...

@piscisaureus (Member)

@kevinkassimo pipelining hasn't been removed, but I'll look into it.

@piscisaureus (Member)

@kevinkassimo Can you try again? I think the latest patch should remove the unfair scheduling that was introduced by the MuxAsyncIterator and restore overall performance.

However, I'm still seeing pretty bad max-latency values. Note that it's strictly the maximum latency that is quite bad. The overall distribution (the one you get with wrk http://localhost:4500 --latency) actually looks quite good. So there's some request somewhere that's taking a bit longer. I suspect this new code might effectively accept 10 connections before servicing any requests on them, which means the first request takes a bit of a hit. But I'm not sure.
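For reference, the multiplexing pattern under discussion can be sketched as follows. This is a simplified illustration in the spirit of MuxAsyncIterator, not the real deno_std implementation:

```typescript
// Merge several async iterators into one stream: each source has at
// most one outstanding next() call, and results are yielded in the
// order they settle (via Promise.race), interleaving the sources.
async function* mux<T>(
  sources: AsyncIterator<T>[],
): AsyncIterableIterator<T> {
  // Tag each pending next() with its source index so the winning
  // source can be re-armed (or retired when it is done).
  const pending = new Map<
    number,
    Promise<{ i: number; r: IteratorResult<T> }>
  >();
  sources.forEach((s, i) => pending.set(i, s.next().then((r) => ({ i, r }))));
  while (pending.size > 0) {
    const { i, r } = await Promise.race(pending.values());
    if (r.done) {
      pending.delete(i);                       // source exhausted
    } else {
      pending.set(i, sources[i].next().then((res) => ({ i, r: res })));
      yield r.value;
    }
  }
}
```

Fairness depends on how the pending promises settle: if one source is always ready, a naive consumer loop can starve the others, which matches the unfair-scheduling symptom described above.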

@piscisaureus (Member)

It could also be due to v8 doing some optimizing after the benchmark has started. I'm seeing much better max-latency numbers on the second and subsequent runs.

@kevinkassimo (Contributor)

@piscisaureus It seems to be performing better than master now.

This branch:

./third_party/wrk/mac/wrk -d 10 http://localhost:4500
Running 10s test @ http://localhost:4500
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   285.13us  185.29us   4.83ms   94.48%
    Req/Sec    18.73k   799.01    20.61k    74.75%
  376607 requests in 10.10s, 17.96MB read
Requests/sec:  37284.09
Transfer/sec:      1.78MB

@kevinkassimo (Contributor) commented May 20, 2019

There is another small issue I noticed: it seems that with 500 concurrent connections our HTTP server dies silently... Might be unrelated to this PR, though:

$ ./third_party/wrk/mac/wrk -c 500 -d 10 http://localhost:4500
Running 10s test @ http://localhost:4500
  2 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    14.34ms   28.00ms 535.90ms   98.24%
    Req/Sec    17.87k     2.44k   22.63k    74.00%
  355802 requests in 10.01s, 16.97MB read
  Socket errors: connect 0, read 446, write 13, timeout 0
Requests/sec:  35534.95
Transfer/sec:      1.69MB
$ ./third_party/wrk/mac/wrk -c 500 -d 10 http://localhost:4500
Running 10s test @ http://localhost:4500
  2 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.00us    0.00us   0.00us     nan%
    Req/Sec     0.00      0.00     0.00       nan%
  0 requests in 10.10s, 0.00B read
  Socket errors: connect 0, read 485, write 14, timeout 0
Requests/sec:      0.00
Transfer/sec:       0.00B

(Actually, on master this results in a panic:

thread '<unnamed>' panicked at 'bad rid', ../../cli/resources.rs:237:15

Not sure why this panic does not occur under this PR while the server still silently dies.)

@piscisaureus (Member)

@ry I think the async tests may not actually be running.

@piscisaureus (Member)

@kevinkassimo I think we may not be dealing with EMFILE errors properly. What happens if you increase the ulimit on your system?

@piscisaureus (Member)

Or running out of ephemeral ports.
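One conventional way to keep a server alive through transient accept() failures such as EMFILE is to back off and retry instead of letting the accept loop die. A sketch under a hypothetical `listener` interface (not the actual Deno API):

```typescript
// Sketch: tolerate transient accept() failures (e.g. EMFILE, "too
// many open files") by retrying with exponential backoff, so the
// server does not silently stop accepting connections.
async function acceptLoop(
  listener: { accept(): Promise<unknown> },
  handle: (conn: unknown) => void,
  maxRetries = 5,
): Promise<void> {
  let retries = 0;
  while (true) {
    try {
      const conn = await listener.accept();
      retries = 0;                           // success resets the backoff
      handle(conn);
    } catch (err) {
      if (++retries > maxRetries) throw err; // persistent failure: give up
      // Exponential backoff gives the OS time to free descriptors.
      await new Promise((res) => setTimeout(res, 2 ** retries * 10));
    }
  }
}
```

Under descriptor exhaustion, sleeping lets in-flight connections finish and release file descriptors before accept() is retried.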

@ry (Member, Author) commented May 20, 2019

I will land this branch now and update the submodule in master: denoland/deno#2378

@ry ry merged commit 68faf32 into denoland:master May 20, 2019
ry added a commit to ry/deno that referenced this pull request Oct 9, 2019